We propose a novel approach for unsupervised zero-shot learning (ZSL) of classes based on their names. Most existing unsupervised ZSL methods aim to learn a model for directly comparing image features and class names. However, this proves to be a difficult task due to the dominance of non-visual semantics in the underlying vector-space embeddings of class names. To address this issue, we discriminatively learn a word representation such that the similarities between class names and combinations of attribute names align with visual similarity. Contrary to traditional zero-shot learning approaches built upon attribute presence, our approach bypasses the laborious attribute-class relation annotations for unseen classes. In addition, our proposed approach renders text-only training possible; hence, training can be augmented without the need to collect additional image data. Experimental results show that our method yields state-of-the-art results for unsupervised ZSL on three benchmark datasets.